Skip to content

[fix](fe) Fix isCountStar incorrectly treating count(null) as count(*)#62548

Merged
morrySnow merged 2 commits intoapache:masterfrom
yujun777:fix/count-null-is-not-count-star
Apr 21, 2026
Merged

[fix](fe) Fix isCountStar incorrectly treating count(null) as count(*)#62548
morrySnow merged 2 commits intoapache:masterfrom
yujun777:fix/count-null-is-not-count-star

Conversation

@yujun777
Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:

Count.isCountStar() returns true for count(null) because it only checks child(0) instanceof Literal without excluding NullLiteral. Since count(null) always returns 0 (it counts non-null values of a constant null), it is semantically different from count(*) which counts all rows.

This could cause incorrect results if downstream rewrite rules (e.g. PushDownAggThroughJoin, SimplifyWindowExpression, PushCountIntoUnionAll) use isCountStar() before CountLiteralRewrite has a chance to rewrite count(null) to 0.

Release note

Fixed count(null) being incorrectly treated as count(*) in query optimizer, which could lead to wrong results in certain rewrite scenarios.

Check List (For Author)

  • Test: Unit Test / Regression test
    • FE UT: CountLiteralRewriteTest (3 tests pass, including new testCountNullIsNotCountStar)
    • Regression test: count_null_not_count_star covering basic, group by, mixed, union all, window, and join paths
  • Behavior changed: No
  • Does this need documentation: No

@Thearas
Copy link
Copy Markdown
Contributor

Thearas commented Apr 16, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@yujun777
Copy link
Copy Markdown
Contributor Author

run buildall

@yujun777
Copy link
Copy Markdown
Contributor Author

/review

morrySnow
morrySnow previously approved these changes Apr 16, 2026
@github-actions github-actions Bot added the approved Indicates a PR has been approved by one committer. label Apr 16, 2026
@github-actions
Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Copy Markdown
Contributor

PR approved by anyone and no changes requested.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Findings

  1. regression-test/suites/nereids_rules_p0/count_null_not_count_star/count_null_not_count_star.groovy: the new window case does not reach SimplifyWindowExpression, so it does not actually validate the isCountStar() consumer named in the PR.
  2. regression-test/suites/nereids_rules_p0/count_null_not_count_star/count_null_not_count_star.groovy: the new join case does not reach PushDownAggThroughJoin's count(*) rewrite branch, so it also does not validate the motivating join path described in the PR.
  3. fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/functions/agg/Count.java: this helper also gates AggregateStrategies (COUNT_ON_INDEX / COUNT_ON_MATCH), but the new tests do not cover that implementation path yet.

Critical Checkpoints

  • Goal: The one-line change in Count.isCountStar() fixes the direct count(null) misclassification. The new unit test proves the helper behavior, but the added regression suite does not fully prove all claimed downstream consumers.
  • Scope: The change is small, focused, and centralized in the right helper.
  • Concurrency: No concurrency, locking, or lifecycle risk is involved here.
  • Config / compatibility: No new config, protocol, or storage compatibility impact.
  • Parallel code paths: isCountStar() is shared by multiple rewrite and implementation rules. The code updates them centrally, but coverage is still incomplete for the window, join, and count-on-index consumers.
  • Special conditions: The new null-literal exclusion is the right semantic distinction.
  • Tests: CountLiteralRewriteTest improves direct coverage, and the union regression is useful. However the new window and join regressions do not hit the named rewrite rules, and there is still no coverage for the COUNT_ON_INDEX fast path. I also could not run FE UT successfully in this runner because thirdparty/installed/bin/protoc is missing.
  • Result files: The new .out file is consistent with the SQL in the added suite.
  • Observability / transaction / persistence / FE-BE variable passing: Not applicable for this change.
  • Performance: No new performance concern found.
  • Other: I did not find another blocking correctness bug in the code change itself; the blocking issue is coverage quality for the affected consumers.

Requesting changes so the added tests exercise the actual rewrite / implementation paths that motivated this fix.

) t
"""

// count(null) as window function
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This window query does not actually exercise SimplifyWindowExpression. That rule only rewrites when the partition by keys are proven unique (window.getLogicalProperties().getTrait().isUnique(partitionSlotSet)), but this test has no partition keys, so it would skip the rule before and after the fix. Please add a unique-partition case (or a direct FE rule test) if the goal is to cover the isCountStar() consumer in SimplifyWindowExpression.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Window test kept as a correctness check (count(null)=0 in window context). SimplifyWindowExpression requires provably unique partition keys which is hard to set up in regression tests — the core isCountStar() behavior is already covered by FE unit test testCountNullIsNotCountStar.

select pk, count(null) over (order by pk) from count_null_test order by pk
"""

// count(null) through join must still return 0, not get rewritten by PushDownAggThroughJoin
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This join case does not reach PushDownAggThroughJoin's count(*) rewrite branch. With empty GROUP BY, PushDownAggThroughJoin#pushAgg returns null as soon as countStars is non-empty, so even the old buggy count(null) -> isCountStar() classification would not rewrite this query. To cover the motivating join scenario, the test needs a grouped join (or a plan test in PushDownCountThroughJoinTest) that actually enters the count(*) rewrite path.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Replaced with a grouped join: select t1.pk, count(null), count(*) from ... group by t1.pk. Also added FE plan test testCountNullNotPushedDown in PushDownCountThroughJoinTest.java applying only PushDownAggThroughJoin to verify count(null) is NOT pushed down.

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary: Count.isCountStar() returns true for count(null) because
it only checks `child(0) instanceof Literal` without excluding NullLiteral.
Since count(null) always returns 0 (it counts non-null values of a constant
null), it is semantically different from count(*) which counts all rows.

This could cause incorrect results if downstream rewrite rules (e.g.
PushDownAggThroughJoin, SimplifyWindowExpression, PushCountIntoUnionAll)
use isCountStar() before CountLiteralRewrite has a chance to rewrite
count(null) to 0.

### Release note

Fixed count(null) being incorrectly treated as count(*) in query optimizer,
which could lead to wrong results in certain rewrite scenarios.

### Check List (For Author)

- Test: Unit Test / Regression test
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@yujun777 yujun777 force-pushed the fix/count-null-is-not-count-star branch from ca09b74 to aed0d9c Compare April 16, 2026 09:45
@yujun777
Copy link
Copy Markdown
Contributor Author

run buildall

@yujun777
Copy link
Copy Markdown
Contributor Author

/review

@github-actions github-actions Bot removed the approved Indicates a PR has been approved by one committer. label Apr 16, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the semantic fix in Count.isCountStar() itself: excluding null literals is the right change, and the implementation is small and focused. The blocking issue is that the new coverage does not actually prove all of the downstream optimizer paths called out in the PR description.

  1. count_null_union does not exercise PushCountIntoUnionAll. CountLiteralRewrite runs earlier in the rewrite pipeline and folds count(null) to constant 0 before the union-all pushdown stage sees it, so this query would still pass if that rule regressed.
  2. count_null_window does not exercise SimplifyWindowExpression. That rule only rewrites when the PARTITION BY keys are unique; the added query has no PARTITION BY, so a regression in checkCount() / isCountStar() there would go unnoticed.
  3. The inverted-index case only checks the negative explain. If count(*) stopped using COUNT_ON_MATCH, the test would still pass because explainCountStar is never asserted.

Critical Checkpoints

  • Goal of current task: Prevent count(null) from being treated as count(*). The code change in Count.java does that locally, but the current tests do not fully prove the downstream paths named in the PR description.
  • Modification size/focus: Small and focused.
  • Concurrency: Not involved in the touched code.
  • Lifecycle/static initialization: Not involved.
  • Config changes: None.
  • Compatibility / FE-BE protocol / persistence: None.
  • Parallel code paths: Yes. Several optimizer paths depend on row-count semantics. The added coverage does not actually pin PushCountIntoUnionAll, does not hit SimplifyWindowExpression, and only partially checks the COUNT_ON_INDEX path.
  • Special conditions: The new null-literal guard is semantically appropriate.
  • Test coverage: Incomplete for the risky paths above.
  • Test result updates: The added .out file matches the SQL results that are asserted.
  • Observability: No change needed.
  • Transactionality / data writes / new FE-BE variables: Not applicable.
  • Performance: No concern from the code change itself.
  • Other issues: None beyond the missing coverage.

Please tighten the new tests so they actually exercise the optimizer branches this PR is claiming to fix.

order_qt_count_mixed """select count(null), count(*), count(a), count(b) from count_null_test"""

// count(null) in subquery with union all
order_qt_count_null_union """
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

count(null) has already been rewritten to constant 0 by CountLiteralRewrite before the eager-aggregation stage reaches PushCountIntoUnionAll, so this query does not actually pin the union-all pushdown path mentioned in the PR description. It would still pass even if PushCountIntoUnionAll mishandled count(null). Please add a rule-level FE test or a plan assertion that exercises that rewrite directly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Added FE unit test testCountNullPushedAsCountNullNotCountStar in PushCountIntoUnionAllTest.java.

Note: CountLiteralRewrite does not rewrite count(null) to 0 in the no-GROUP-BY case. At line 66 of CountLiteralRewrite.java, it guards with aggFuncs.isEmpty() — when count(null) is the only aggregate and there are no GROUP BY keys, the resulting aggregate would have zero output expressions, so the rule returns null (no-op). PushCountIntoUnionAll therefore does see count(null) in this case.

The new FE unit test directly verifies: (1) PushCountIntoUnionAll fires (Sum0 appears in upper agg), and (2) child agg expressions do NOT contain a Count with isCountStar=true.

"""

// count(null) as window function should still return 0
order_qt_count_null_window """
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This window case does not reach SimplifyWindowExpression. That rule only rewrites when the PARTITION BY keys are unique, and an empty partition-key set is not treated as unique. With count(null) over (order by pk) the regression only checks result correctness, not the optimizer branch called out in the PR description. Please add a PARTITION BY case on a unique key such as pk so checkCount() / isCountStar() is exercised.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Added a UNIQUE KEY table count_null_uniq_test with PARTITION BY pk so isUnique({pk}) returns true and SimplifyWindowExpression does fire. The test asserts count(null) over (partition by pk) = 0 for each row with explicit assertEquals(0L, row[1]) assertions. With the fix, isCountStar() returns false for count(null)checkCount() returns false → window is NOT simplified → correctly returns 0.


// count(*) with MATCH predicate may use COUNT_ON_MATCH optimization
// count(null) with same predicate must NOT — verify via explain
def explainCountStar = sql """
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the negative explain is asserted here. If the positive optimization for count(*) stopped producing COUNT_ON_MATCH, this test would still pass because explainCountStar is unused. Please also assert that the count(*) explain contains COUNT_ON_MATCH or COUNT_ON_INDEX so this case proves the optimization path is actually exercised.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Replaced the explain shape plan + assertFalse(contains 'COUNT_ON_MATCH') block — which was trivially always-true since explain shape plan calls shapeInfo() (returns class name only, never includes pushAggOp) — with proper explain {} DSL blocks:

  • count(*) asserts contains "pushAggOp=COUNT_ON_INDEX" (positive)
  • count(null) asserts notContains "pushAggOp=COUNT_ON_INDEX" (negative)

Note: PhysicalPlanTranslator maps COUNT_ON_MATCHTPushAggOp.COUNT_ON_INDEX, so explain output shows pushAggOp=COUNT_ON_INDEX for both COUNT_ON_INDEX and COUNT_ON_MATCH physical paths. The positive assertion proves the optimization is actually exercised for count(*).

@yujun777
Copy link
Copy Markdown
Contributor Author

run feut

@yujun777
Copy link
Copy Markdown
Contributor Author

run vault_p0

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 0.00% (0/1) 🎉
Increment coverage report
Complete coverage report

… optimizer paths

### What problem does this PR solve?

Issue Number: close #xxx

Problem Summary:
Address three reviewer concerns on PR apache#62548:

1. PushCountIntoUnionAll: Add FE unit test testCountNullPushedAsCountNullNotCountStar
   verifying count(null) is pushed down as count(null) not as count(*). CountLiteralRewrite
   skips count(null) with no GROUP BY (would produce empty agg), so PushCountIntoUnionAll
   does see count(null). The new test explicitly pins this behavior.

2. SimplifyWindowExpression: Add regression test with PARTITION BY on a UNIQUE KEY table
   (count_null_uniq_test). The rule only fires when partition keys are proven unique; this
   test directly exercises checkCount() → isCountStar() for count(null).

3. COUNT_ON_INDEX: Replace explain shape plan (which only shows class name, not pushAggOp)
   with the explain{} DSL block. Now asserts count(*) DOES use pushAggOp=COUNT_ON_INDEX
   and count(null) does NOT, making the test non-trivial.

### Release note

None

### Check List (For Author)

- Test: Unit Test / Regression test
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@yujun777
Copy link
Copy Markdown
Contributor Author

run buildall

@yujun777
Copy link
Copy Markdown
Contributor Author

/review

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blocking issues found.

Critical Checkpoints

  • Goal: The change prevents count(null) from being treated as count(*) in shared Count.isCountStar() logic. The one-line fix is correct, and the added UT/regression coverage now exercises the direct helper behavior plus the union-all, join, window, and count-on-index consumers called out in the PR.
  • Scope: The modification is small, focused, and centralized in the right helper.
  • Concurrency: No concurrency, locking, or lifecycle risk is involved in the touched code.
  • Lifecycle / static init: No special lifecycle or static initialization impact.
  • Config: No new configuration or dynamic-config behavior.
  • Compatibility: No FE-BE protocol, persistence, or rolling-upgrade compatibility impact.
  • Parallel code paths: I checked the current isCountStar() consumers; the shared helper fix covers the relevant rewrite and implementation paths, and I did not find another reachable correctness gap for count(null).
  • Special conditions: Excluding null literals is the right semantic distinction. count(*) and non-null literal counts keep their existing behavior.
  • Test coverage: The new FE tests and regression suite are targeted and sufficient for this fix. The new .out file is consistent with the deterministic queries. I could not rerun FE UT in this runner because thirdparty/installed/bin/protoc is missing, so the verification here is based on code and plan-path inspection.
  • Observability: No additional logs or metrics are needed for this change.
  • Transactionality / data writes / FE-BE vars: Not applicable.
  • Performance: No new performance concern found.
  • Other: No additional blocking issues found.

@abcxyz333
Copy link
Copy Markdown

run vault_p0

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 0.00% (0/1) 🎉
Increment coverage report
Complete coverage report

@abcxyz333
Copy link
Copy Markdown

run nonConcurrent

@hello-stephen
Copy link
Copy Markdown
Contributor

FE Regression Coverage Report

Increment line coverage 0.00% (0/23) 🎉
Increment coverage report
Complete coverage report

@github-actions github-actions Bot added the approved Indicates a PR has been approved by one committer. label Apr 21, 2026
@github-actions
Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@morrySnow morrySnow merged commit 9ba05cd into apache:master Apr 21, 2026
33 checks passed
github-actions Bot pushed a commit that referenced this pull request Apr 21, 2026
#62548)

### What problem does this PR solve?

Problem Summary:

`Count.isCountStar()` returns `true` for `count(null)` because it only
checks `child(0) instanceof Literal` without excluding `NullLiteral`.
Since `count(null)` always returns 0 (it counts non-null values of a
constant null), it is semantically different from `count(*)` which
counts all rows.

This could cause incorrect results if downstream rewrite rules (e.g.
`PushDownAggThroughJoin`, `SimplifyWindowExpression`,
`PushCountIntoUnionAll`) use `isCountStar()` before
`CountLiteralRewrite` has a chance to rewrite `count(null)` to `0`.

### Release note

Fixed `count(null)` being incorrectly treated as `count(*)` in query
optimizer, which could lead to wrong results in certain rewrite
scenarios.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
github-actions Bot pushed a commit that referenced this pull request Apr 21, 2026
#62548)

### What problem does this PR solve?

Problem Summary:

`Count.isCountStar()` returns `true` for `count(null)` because it only
checks `child(0) instanceof Literal` without excluding `NullLiteral`.
Since `count(null)` always returns 0 (it counts non-null values of a
constant null), it is semantically different from `count(*)` which
counts all rows.

This could cause incorrect results if downstream rewrite rules (e.g.
`PushDownAggThroughJoin`, `SimplifyWindowExpression`,
`PushCountIntoUnionAll`) use `isCountStar()` before
`CountLiteralRewrite` has a chance to rewrite `count(null)` to `0`.

### Release note

Fixed `count(null)` being incorrectly treated as `count(*)` in query
optimizer, which could lead to wrong results in certain rewrite
scenarios.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
yiguolei pushed a commit that referenced this pull request Apr 22, 2026
…) as count(*) #62548 (#62667)

Cherry-picked from #62548

Co-authored-by: yujun <yujun@selectdb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
yiguolei pushed a commit that referenced this pull request Apr 22, 2026
…) as count(*) #62548 (#62666)

Cherry-picked from #62548

Co-authored-by: yujun <yujun@selectdb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/4.0.6-merged dev/4.1.1-merged reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants